Application of sulfur SAD to small crystals with a large asymmetric unit and anomalous substructure

Multiple approaches to data merging and scaling and anomalous substructure determination were used to solve the structure of a large asymmetric unit from weakly diffracting crystals by sulfur SAD.


Introduction
Despite recent advances in synchrotron hardware, data-collection strategies and crystallographic software packages, the de novo phasing of macromolecular crystal structures by sulfur single-wavelength anomalous dispersion (S-SAD) from native sulfur atoms can be challenging (Liu & Hendrickson, 2015, 2017; Rose et al., 2015; Terwilliger et al., 2016; Olieric et al., 2016; Akey et al., 2016; Weiss, 2017). Here, we describe strategies to determine the anomalously scattering sulfur substructure of Ric-8A, which was a necessary step towards the solution of its structure by SAD. In a previous publication we described the structure of Ric-8A, but described only briefly the methods by which it was determined (Zeng et al., 2019).
The weak anomalous scattering signal (Bijvoet ratio of ~1% of the total reflection intensities) from sulfur atoms in proteins limits its utility for phase determination. The anomalous signal is a function of the square root of the ratio of the number of unique reflections, and hence the resolution, to the number of atoms in the anomalous substructure. Successful application of S-SAD may require (i) the accurate collection and merging of highly redundant and isomorphous data sets with quantitatively strong signal-to-noise ratios, often using strategies tailored to individual synchrotron sites, and (ii) the finding of sulfur substructures using optimized parameters in crystallographic software packages to maximize sulfur anomalous signals (Liu & Hendrickson, 2015, 2017; Olieric et al., 2016; Akey et al., 2016; Bunkóczi et al., 2015).
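To make the ~1% figure concrete, the following minimal sketch evaluates a commonly used approximation for the expected Bijvoet ratio at zero scattering angle, sqrt(2) x sqrt(N_A/N_P) x f''/Z_eff. This approximation is not taken from this paper, and the inputs are assumptions: 40 anomalous scatterers, roughly 7000 non-H protein atoms for a 904-residue asymmetric unit, f'' = 0.8 e and an effective atomic number Z_eff of 6.7.

```python
import math


def bijvoet_ratio(n_anom, n_protein_atoms, f_double_prime, z_eff=6.7):
    """Approximate <|dF|>/<F> at zero scattering angle.

    n_anom           number of anomalous scatterers in the asymmetric unit
    n_protein_atoms  number of non-H protein atoms (assumed, not measured)
    f_double_prime   imaginary anomalous scattering factor f'' in electrons
    z_eff            effective mean atomic number of protein atoms (assumed)
    """
    return math.sqrt(2.0) * math.sqrt(n_anom / n_protein_atoms) * f_double_prime / z_eff


# Illustrative inputs only; the atom count is a rough guess for 904 residues.
ratio = bijvoet_ratio(n_anom=40, n_protein_atoms=7000, f_double_prime=0.8)
print(f"Estimated Bijvoet ratio: {100 * ratio:.1f}%")
```

With these assumed inputs the estimate is about 1.3%, in line with the order of magnitude quoted above; the value scales only with the square root of the number of scatterers.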
Several recently published reviews have addressed major advances in synchrotron hardware and crystallographic software to reduce systematic errors that obscure anomalous differences arising from sulfur substructures (Liu & Hendrickson, 2015, 2017; Terwilliger et al., 2016; Olieric et al., 2016; Akey et al., 2016; Rose et al., 2015; Hendrickson, 2014). As recent generations of synchrotron beamlines have been constructed with novel hardware configurations, it may be important to design a data-collection strategy that is beamline-specific to acquire highly precise and redundant S-SAD anomalous data sets. From crystal-harvesting loop selection to data processing, every step is crucial for collecting useful S-SAD anomalous data. However, many practices are generally applicable to all beamlines.
Currently, a common approach is to measure S-SAD anomalous data at 6000-7000 eV, at which the sulfur anomalous signal (f″ of 0.8 e⁻) is significant but absorption is low, with a fast and large-area photon-counting detector. Fast-readout photon-counting detection is especially useful when data can be collected in a shutterless and fine-sliced oscillation mode. Recently, it has been possible to measure S-SAD data at longer wavelengths near the sulfur K edge at very specialized synchrotron beamlines. One example is the I23 beamline at Diamond Light Source, UK, where evacuation of the space between the sample and detector reduces air absorption at low energy and the use of a unique semi-cylindrical, large-area, pixel-array detector affords access to diffraction at high 2θ angles (Weiss, 2017; Wagner et al., 2016).
It is generally agreed that sulfur anomalous signals can be boosted by collecting high-multiplicity data sets within an optimal resolution range. However, it may not be possible to obtain such data sets from a very small crystal that is subject to radiation damage or belongs to a low-symmetry space group (Klinke et al., 2015). In such cases, it may be possible to merge data sets collected from multiple isomorphous crystals. On the other hand, several approaches to mitigate radiation damage have been adopted in synchrotron data-collection strategies. For example, the helical data-collection mode implemented at several micro-focus beamlines allows the collection of oscillation images while translating a crystal along a defined collection path parallel to its long axis to reduce radiation damage (Polsinelli et al., 2017).
If data sets from multiple crystals are required to obtain suitably accurate intensity measurements, merging and scaling thousands of frames with millions of reflections from multiple, marginally isomorphous crystals can present a challenge for S-SAD phasing. A cluster-based analysis has been used to prioritize individual unmerged data sets based on their divergence in unit-cell parameters and reflection quality, which includes the errors in measurement. This methodology has been incorporated into data-scaling and averaging programs, such as BLEND in CCP4 (Foadi et al., 2013) and phenix.scale_and_merge.
After obtaining a merged data set that optimizes anomalous signals at the highest resolution, locating the atoms of the sulfur substructure appears to be the most challenging task in solving the S-SAD phasing problem. The dual-space direct method implemented in SHELXD (Sheldrick, 2010) refines substructure positions and phases by recurrently alternating reciprocal-space phase refinement and density modification.
The phenix.hyss tool (Grosse-Kunstleve & Adams, 2003; Bunkóczi et al., 2015) uses both dual-space substructure completion and a correlation-based scoring procedure to find the anomalously scattering substructure, as well as SAD likelihood-function-based gradient maps to complete partial substructures from the anomalous difference Patterson function and the same function to evaluate potential solutions. More recently, PRASA was introduced to implement a relaxed averaged alternating reflections phase-retrieval algorithm to extract the positions of anomalous scatterers from the anomalous difference data (Skubák, 2018).
Many of the hundreds of structures that have been solved by S-SAD were determined from crystals with relatively small (<25 kDa) asymmetric units that diffract to d-spacings beyond 2 Å, factors that are associated with strong diffraction and small sulfur substructures (Rose et al., 2015; Gorgel et al., 2015). For moderately or weakly diffracting crystals with large asymmetric units and sulfur substructures, which may also be subject to significant radiation decay, it becomes essential to combine all possible optimization methods to acquire accurate and highly redundant data sets that afford quantitation of anomalous differences. Moderate-to-weakly diffracting crystals require much longer exposure times to obtain anomalous data with sufficient redundancy to afford acceptable signal-to-noise ratios. One successful example was reported by Smith and coworkers, in which data sets from 28 crystals were merged to obtain the accurate sulfur anomalous signal at 4 Å resolution required to determine the flavivirus NS1 structure to 3 Å resolution; another described the extraction of a sulfur substructure using data to 7 Å resolution from 32 crystals with phase extension to 3.5 Å resolution (El Omari et al., 2014). Data collection from several crystals each in multiple orientations has also proven useful (Olieric et al., 2016). Here, we summarize our experience with S-SAD phasing in such a case: to determine the structure of an asymmetric unit containing two 51 kDa domains of the protein Ric-8A using the brilliant, micro-focused beam at the tunable Frontier micro-focusing Macromolecular Crystallography (FMX) beamline at National Synchrotron Light Source II (NSLS II; Schneider et al., 2021).
Ric-8A plays essential roles in cells as both a chaperone and a guanine nucleotide-exchange factor for α subunits of heterotrimeric G proteins (Tall et al., 2003; Thomas et al., 2011; Chan et al., 2013).
Since there are no homologs of Ric-8A with known structure, molecular-replacement phasing was not an option at the time that the structure was determined. Further, selenomethionine-derivatized Ric-8A only forms microcrystals, and native Ric-8A crystals are highly sensitive to heavy metals. Thus, we sought to obtain phase information from the anomalous signals arising from the sulfur atoms in the nine cysteine and ten methionine residues in each molecule of Ric-8A, which account for 4.2% of the 904 residues in the asymmetric unit. None of the cysteine residues is involved in a disulfide bond. While the best native crystals diffract to 2.5 Å resolution using X-rays of wavelength 1.77 Å, we show that sulfur anomalous data can only be accurately measured to 3-3.4 Å resolution. A survey of the PDB reveals that Ric-8A represents one of about ten crystal structures comprising more than 100 kDa per asymmetric unit (two molecules per asymmetric unit, total 102.2 kDa) that have been determined by native anomalous scattering from light atoms (Z < 15). We discuss data-collection strategies using the high-precision crystal-positioning hardware and control software at the FMX beamline, and the merging and scaling procedures that led to successful structure determination of Ric-8A by S-SAD.

Crystallization of Ric-8A
The expression and purification of the N-terminally phosphorylated 452-residue Gα-binding domain of rat Ric-8A has been described elsewhere (Zeng et al., 2019). Briefly, initial crystallization experiments were performed using a Gryphon robot (Art Robbins, California, USA) to screen over 1000 conditions using commercially available kits by mixing equal amounts of reservoir solution with phosphorylated or unphosphorylated Ric-8A protein at concentrations as high as 75 mg ml⁻¹. Small needle-like crystals were observed in conditions from The PEGs II Suite (Qiagen) at 20 °C after 72 h. The reservoir solutions from the initial hits consisted of 0.2 M lithium sulfate, 0.1 M Tris buffer pH 8.0 and 25-30% PEG 4000 or PEG 5000 MME. Crystal quality was improved by using a 3:1 ratio of phosphorylated Ric-8A protein and reservoir solutions consisting of 0.2 M lithium sulfate, 0.1 M Tris or HEPES buffers pH 7-9 and 20-30% PEG 3350. The larger Ric-8A crystals, measuring 50-250 µm in the longest dimension and 5-20 µm in cross section (Fig. 1), were observed after 2-3 weeks of incubation time. Prior to mounting, crystals were harvested in a cryoprotection solution containing 20-25%(v/v) PEG 400 or oil-based cryoprotectant (Paratone-N) and then rapidly plunged into liquid nitrogen.

Crystal mounting
To minimize systematic errors from sample vibration during data collection, we used a 20 µm nylon crystal-mounting CryoLoop (Hampton Research) to harvest Ric-8A crystals. The thick nylon loops decrease the mechanical vibration from exposure to the N₂ cryostream, which improves data quality, especially when data are derived from merging multiple data sets. The sample loop was mounted on a goniometer with cryocooling capability to minimize radiation damage (Garman & Owen, 2006; Teng & Moffat, 2000).

Data collection
Eighteen data sets were recorded at 100 K and a wavelength of 1.7712 Å (7000 eV) using the helical data-collection method (Polsinelli et al., 2017) on the micro-focusing FMX beamline at NSLS II equipped with an EIGER 16M pixel-array detector with a 133 Hz framing rate. The crystal-to-detector distance was set to 200, 175 or 150 mm according to the smallest d-spacings at which diffraction was observed, affording the collection of data at resolutions ranging from 2.67 to 2.23 Å at the detector edge. Crystals were irradiated with a 10 × 10 µm beam at 10% attenuation of a flux of ~5.0 × 10¹² photons s⁻¹ in a helium flight path. Data were collected with a thin-slice oscillation range (0.1-0.2° per image) at 0.1-0.2 s exposure per image for a total rotation of 360-5760° about the φ axis per data set (Table 1). The a* axis was inclined 3-15° to the φ axis for ten of the 18 data sets and within a 20-50° angle to φ for the remaining eight. Over all 18 data sets, more than 23 500° of data were measured. The data sets were processed by XDS (Kabsch, 2010) in space group P1. Analysis of the data using POINTLESS (Evans & Murshudov, 2013) in the CCP4 software package (Winn et al., 2011) confirmed the space-group assignment as P2₁2₁2₁. Each of the 18 unmerged data sets produced by XDS has been deposited in the PDB in CCP4 mtz format, associated with PDB entry 6nmg, with the filename X-XDS.mtz, where X is the data-set name shown in column 1 of Table 1.
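The relationship between the wavelength, photon energy and detector-edge resolution quoted above can be checked with a short calculation. The sketch below uses E = hc/λ and Bragg's law for a flat detector centred on the direct beam; the half-width of 155 mm is an assumed, illustrative value for a large-area detector and is not taken from this paper.

```python
import math

HC_EV_ANGSTROM = 12398.42  # Planck constant times speed of light, in eV·Å


def energy_ev(wavelength_a):
    """Photon energy in eV for a wavelength given in Å."""
    return HC_EV_ANGSTROM / wavelength_a


def edge_resolution(wavelength_a, distance_mm, half_width_mm):
    """d-spacing (Å) recorded at the edge of a flat, centred detector.

    half_width_mm is the distance from the beam centre to the detector
    edge; the value used below is an assumption for illustration.
    """
    two_theta = math.atan(half_width_mm / distance_mm)
    return wavelength_a / (2.0 * math.sin(two_theta / 2.0))


wavelength = 1.7712  # Å, as used for the anomalous data sets
print(f"Energy: {energy_ev(wavelength):.0f} eV")
for distance in (200.0, 175.0, 150.0):
    d_edge = edge_resolution(wavelength, distance, half_width_mm=155.0)
    print(f"{distance:.0f} mm -> {d_edge:.2f} Å at the detector edge")
```

Shortening the crystal-to-detector distance moves higher-angle reflections onto the detector, which is why the shorter distances were chosen for the better-diffracting crystals.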

Data reduction, phase determination and model building
Parameters describing the 18 data sets obtained from 14 crystals (Table 1) were computed using AIMLESS (Evans & Murshudov, 2013) in the CCP4 software package, the phenix.anomalous_signal tool and SHELXC (Sheldrick, 2010). The BLEND suite (Foadi et al., 2013) was used to cluster data sets and scale them using AIMLESS. The phenix.scale_and_merge and phenix_scale_anomalous_signal tools in the Phenix program suite (Liebschner et al., 2019) were also used to scale data sets clustered using BLEND and the cluster comprised of data from all crystals. In phenix.scale_and_merge, the data-selection parameter minimum_datafile_fraction was set to accept any data set containing at least 5% of the number of observations in the largest data set (the default value is 30%). SHELXC/D/E (Sheldrick, 2010), executed through the HKL2MAP graphical interface (Pape & Schneider, 2004), was used to determine the sulfur substructure from the merged data sets. The Phenix submodule HySS (Bunkóczi et al., 2015) was also used to find atoms in the sulfur substructure. The phenix.emma program was used to correlate the candidate sulfur substructures generated using HySS and SHELXD with sulfur positions in the refined model of Ric-8A. The anomalously scattering sulfur substructure and crystallographic phases were refined using the phenix.autosol procedure (Terwilliger et al., 2009) with optimization of the positions of sulfur atoms. The twofold noncrystallographic symmetry (NCS) operator was calculated from the sulfur sites during phase refinement. Phases were extended to 2.2 Å resolution with a native data set that was measured using X-rays at a wavelength of 0.979 Å as described by Zeng et al. (2019) and was used to construct a partial model using the AutoBuild wizard (Terwilliger et al., 2008). Fragments of additional main chains were constructed after iterative manual model rebuilding and refinement with the phenix.refine tool (Afonine et al., 2012).
The final refinement statistics are recorded with the description of the crystal structure (Zeng et al., 2019).

Crystallization and crystal harvesting
The largest crystals of phosphorylated Ric-8A, which measured 50-250 µm along the unit-cell a axis and 5-20 µm in cross section (Fig. 1), were observed after 2-3 weeks of incubation time. Crystals of Ric-8A phosphorylated at Ser435 and Thr440 were larger in both length and cross section than those of unphosphorylated Ric-8A. Several cryoprotectants were tested. Crystals were harvested either with a PEG-based cryoprotectant (reservoir solution + 20% PEG 400) or an oil-based cryoprotectant (Paratone-N). We found that Ric-8A crystals were sensitive to glycerol and sugar-based cryoprotectants. The diffraction quality of the crystals deteriorated during storage in liquid nitrogen. Furthermore, all crystals dissolved in salt-based cryo-solutions. These observations were consistent with previous findings that penetrating cryoprotectants can increase the crystal mosaicity by displacing or replacing solvent in the crystal lattice (López-Jaramillo et al., 2002). We found a solution of 20-25%(v/v) PEG 400 in reservoir solution to be a suitable cryoprotectant. PEG-cryoprotected crystals were generally isomorphous and diffracted to 2.2 Å resolution at conventional synchrotron sources using ~12 keV energy. The crystals remained marginally isomorphous after cryoprotection: the differences in unit-cell parameters are within 0.2-0.5% among these data sets, with mean values of a = 66.8 (0.3), b = 103.5 (0.2), c = 141.5 (0.6) Å (Table 1). We also used oil-based cryoprotectants, such as Paratone-N, paraffin and Perfluoropolyether Cryo Oil (Hampton Research). In addition to their nonpenetrating properties, the oil cryoprotectants have the advantage that they reduce scattering and optical distortion during data collection (Riboldi-Tunnicliffe & Hilgenfeld, 1999). In our case, Paratone-N provided excellent cryoprotection but resulted in shrinkage along all three unit-cell axes by 5-16% depending on the harvesting time (Zeng et al., 2019).

Table 1. Scaling parameters for single Ric-8A data sets.
Values in parentheses are for the highest resolution shell. $I_i(hkl)$ is the ith observation of the intensity of reflection hkl and $\langle I(hkl)\rangle$ is the mean over n observations. § Statistics were generated using phenix.anomalous_signal. $\mathrm{CC}_{\mathrm{ano}} = \langle \Delta_{\mathrm{ano}}\,\Delta_{\mathrm{ano,obs}}\rangle / (\langle \Delta_{\mathrm{ano}}^{2}\rangle^{1/2}\,\langle \Delta_{\mathrm{ano,obs}}^{2}\rangle^{1/2})$, where Δ_ano is the ideal anomalous difference and Δ_ano,obs is the measured anomalous difference (F⁺ − F⁻). CC_ano(1/2) is the anomalous correlation coefficient between half data sets. ¶ |Δ_ano|/σ(Δ_ano) values were computed using SHELXC (d″/sig).
The anomalous data sets described here were collected from crystals of phosphorylated Ric-8A cryoprotected in PEG 400.

Anomalous signal analysis of Ric-8A crystals
In advance of collecting diffraction data, we used the phenix.plan_sad_experiment tool to estimate the relationship between the I/σ(I) of the data and the observed anomalous signal ⟨S_ano,obs⟩. This indicator is proportional to the 'useful' correlation coefficient between observed anomalous differences and ideal anomalous differences (CC_ano) generated by a Bayesian estimator on the basis of a diverse set of structures and data sets deposited in the PDB (Berman et al., 2000), where Δ_ano is the ideal anomalous difference and Δ_ano,obs is the measured anomalous difference:

$$\mathrm{CC}_{\mathrm{ano}} = \frac{\langle \Delta_{\mathrm{ano}}\,\Delta_{\mathrm{ano,obs}}\rangle}{\langle \Delta_{\mathrm{ano}}^{2}\rangle^{1/2}\,\langle \Delta_{\mathrm{ano,obs}}^{2}\rangle^{1/2}}$$

The tool was provided with the amino-acid sequence of residues 1-452 of Ric-8A, 4.2% of which are methionine and cysteine. Known or estimated parameters include the number of reflections, N_refl, at the target resolution of 3.0 Å, the number of atoms that comprise the sulfur substructure, n_site, and the second moment of the scattering factors of the anomalous substructure at the X-ray wavelength of 1.7712 Å, f_b. For Ric-8A crystals, the maximum anomalous scattering from S atoms, f″, is 0.8 e⁻.
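As an illustration of the anomalous correlation coefficient used throughout this paper (the mean product of ideal and measured anomalous differences, normalized by their root-mean-square values), the following minimal sketch evaluates it for two lists of differences. The function name and the input values are hypothetical; in practice the ideal differences are only available once a refined model exists.

```python
import math


def cc_ano(ideal, observed):
    """Correlation <a*b> / (<a^2>^(1/2) <b^2>^(1/2)) between paired differences.

    ideal     ideal anomalous differences (e.g. calculated from a model)
    observed  measured anomalous differences (F+ - F-) for the same reflections
    """
    if len(ideal) != len(observed) or not ideal:
        raise ValueError("inputs must be non-empty and of equal length")
    product_sum = sum(a * b for a, b in zip(ideal, observed))
    ideal_sq = sum(a * a for a in ideal)
    observed_sq = sum(b * b for b in observed)
    # The 1/n factors of the three means cancel in the ratio.
    return product_sum / math.sqrt(ideal_sq * observed_sq)


# Made-up differences: the measured values follow the ideal ones with noise.
ideal_diffs = [1.0, -2.0, 0.5, -0.5, 3.0]
measured_diffs = [0.6, -1.1, 1.4, -0.9, 2.2]
print(f"CC_ano = {cc_ano(ideal_diffs, measured_diffs):.2f}")
```

Because the anomalous differences are not mean-centred in this definition, a data set whose measured differences are pure noise gives a value near zero, whereas perfectly measured differences give 1.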
At the target resolution of 3.0 Å, the anomalous signal ⟨S_ano,obs⟩ is predicted to be below 8 assuming a maximum I/σ(I) of 100 and an estimated CC_ano of 0.56, corresponding to a 74% estimated probability of finding the anomalous substructure and an estimated figure of merit of phasing of 0.33. However, the probability and figure of merit are reduced to 26% and 0.27, respectively, at a target resolution of 5.0 Å. These estimates are made with the assumption that all sulfur atoms are highly ordered and fully occupied, and that the crystals do not suffer from radiation decay during data collection.
We concluded from the above analysis that, provided that the reflection data are collected accurately, as indicated by I/σ(I), and the atomic displacement factors of the S atoms are low, Ric-8A crystals would be expected to exhibit a measurable anomalous signal at a resolution limit of 3-3.5 Å (Tables 1, 2 and 3), depending on the choice of software used to scale and merge the 18 data sets.

Data collection and merging strategy
The aim of the S-SAD data-collection strategy is to measure a highly redundant intensity data set, affording full coverage of reciprocal space with accurate sulfur anomalous differences and minimal radiation damage. While the inverse-φ data-collection mode minimizes the time interval, and thus differences in radiation-induced decay, between measurements of Friedel pairs, this strategy could not be implemented with the goniostat geometry and control software installed at the FMX beamline at the time that the data were collected. We therefore opted for very high redundancy afforded by measurement of rotation data from 14 single crystals (18 data sets) over oscillation ranges of 360-5760° in a helical data-collection mode to minimize radiation damage (Polsinelli et al., 2017; Table 1).
The resolution of each of the data sets was determined according to the criterion implemented in POINTLESS and AIMLESS (Evans & Murshudov, 2013) whereby the resolution limit is defined as that at which CC_1/2 falls below 0.3. By this measure, the diffraction limits of the 18 data sets ranged from 2.43 to 3.46 Å. This may have underestimated the resolution in data sets for which I/σ(I) > 2 in the highest resolution shell, where a steep fall-off in intensity and R_meas was observed in many of the crystals. Analysis of the anomalous differences in the individual data sets suggested that a more conservative limit of 3.4 Å would be appropriate. At this limit, the anomalous signal, |Δ_ano|/σ(Δ_ano) (d″/sig), did not exceed 1.0 for any of the data sets (Table 1). Fourteen of the 18 individual data sets exhibited poor CC_ano values that were not indicative of useful anomalous phasing power (Table 1). For these, the correlation of anomalous differences between half data sets, CC_ano(1/2), was close to zero. Three data sets, 2_10, 2_14 and 2_15, were highly redundant and accounted for nearly half of the total observations in the 3.2-3.5 Å resolution range.
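The CC_1/2 criterion described above can be illustrated with a simplified sketch that scans resolution shells from low to high resolution and reports the last shell before the half-data-set correlation drops below the 0.3 threshold. This is only a shell-by-shell approximation (the actual programs may fit a smooth curve through the shell values), and the shell values below are invented for illustration.

```python
def resolution_limit(shells, cc_half_cutoff=0.3):
    """Return the d_min (Å) of the last shell with CC_1/2 at or above the cutoff.

    shells is a list of (d_min, cc_half) tuples ordered from low to high
    resolution. Returns None if even the first shell is below the cutoff.
    """
    limit = None
    for d_min, cc_half in shells:
        if cc_half < cc_half_cutoff:
            break  # stop at the first shell that fails the criterion
        limit = d_min
    return limit


# Hypothetical shell statistics for one data set (not taken from Table 1).
example_shells = [
    (6.0, 0.99),
    (4.0, 0.95),
    (3.4, 0.60),
    (3.0, 0.25),
    (2.7, 0.10),
]
print(f"Resolution limit: {resolution_limit(example_shells)} Å")
```

The same scan could be applied to the anomalous half-data-set correlation to arrive at a more conservative limit for substructure searches.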
We used two strategies to generate merged and scaled Ric-8A data sets with the goal of extracting sulfur anomalous differences of sufficient intensity and accuracy to reveal the sulfur substructure. The first of these, which employed the CCP4 program BLEND (Foadi et al., 2013), was to identify clusters of data sets that would potentially yield the most accurate anomalous signals by optimizing isomorphism among the included data sets. Then, in BLEND synthesis mode, these data sets were scaled and merged in AIMLESS (Evans & Murshudov, 2013; Table 2). We also employed phenix.scale_and_merge to process data-set clusters generated by BLEND without progressing through the synthesis stage. The second approach was to maximize redundancy by using all 18 data sets, employing phenix.scale_and_merge to scale and weight individual data sets. At the same time, we were able to evaluate φ-weighted versus local scaling algorithms applied by AIMLESS and phenix.scale_and_merge, respectively. Both BLEND and phenix.scale_and_merge exclude non-isomorphous or radiation-damaged images that would degrade anomalous signals.
Merging and processing of multiple data sets by BLEND was based on pairwise comparison of individual data sets to develop a hierarchy of data-set clusters, which is represented as a dendrogram (Fig. 2) based on the similarity of unit-cell parameters. BLEND analysis identified three subclusters, characterized by aggregate values of the linear cell variation (LCV) parameter in the range 0.34-0.89%. In contrast, the LCV for the entire data set was 1.48%, which corresponds to a maximum variation of 2.3 Å in the diagonal distances of the three unit-cell faces among the crystals in the data set. Execution of BLEND in synthesis mode invokes AIMLESS to scale and merge data within each cluster (Table 2). Monotonic changes in scaling B factors, typically over the range from −5 to −10, were consistent with the absence of significant radiation decay. Relative to the entire data set, and apart from cluster 2 and the 1+2 supercluster, clustering did not result in a significant reduction in R_meas or R_p.i.m. despite the improvement in I/σ(I) for all but cluster 3, which includes many weak, high-resolution data. Importantly, none of the clusters appear to exhibit strong anomalous signals, as estimated by the slope of the normal probability plot of ΔI_ano/σ(ΔI_ano), where ΔI_ano = I⁺ − I⁻ (Evans, 2011). Likewise, no improvement is observed in the correlation of anomalous differences between half data sets (CC_ano(1/2)), which is not statistically significant for any of the data-set clusters. Using the criterion described above, AIMLESS set the high-resolution limit of the merged set of all 18 data sets to 3.40 Å. The mid-slope of the anomalous normal probability plot is 0.97, which is consistent with a marginal anomalous signal.

Table footnotes (cluster statistics from phenix.anomalous_signal and SHELXC): † Statistics were generated using phenix.anomalous_signal. ‡ $\mathrm{CC}_{\mathrm{ano}} = \langle \Delta_{\mathrm{ano}}\,\Delta_{\mathrm{ano,obs}}\rangle / (\langle \Delta_{\mathrm{ano}}^{2}\rangle^{1/2}\,\langle \Delta_{\mathrm{ano,obs}}^{2}\rangle^{1/2})$, where Δ_ano are the ideal and Δ_ano,obs are the measured anomalous differences (F⁺ − F⁻). § CC_ano(1/2) is the anomalous correlation coefficient between half data sets. ¶ d″/sig is the anomalous signal strength computed using SHELXC.

Table footnotes (cluster scaling statistics): † MD is the largest variation across the diagonal distances (D_ab, D_ac, D_bc) of the three unit-cell faces among data sets in the cluster. ‡ $R_{\mathrm{meas}} = \sum_{hkl} \{N(hkl)/[N(hkl)-1]\}^{1/2} \sum_i |I_i(hkl) - \langle I(hkl)\rangle| \,/\, \sum_{hkl}\sum_i I_i(hkl)$, where $I_i(hkl)$ is the ith observation of the intensity of reflection hkl and $\langle I(hkl)\rangle$ is the mean over n observations. § $R_{\mathrm{p.i.m.}} = \sum_{hkl} \{1/[N(hkl)-1]\}^{1/2} \sum_i |I_i(hkl) - \langle I(hkl)\rangle| \,/\, \sum_{hkl}\sum_i I_i(hkl)$. ¶ CC_1/2 is the correlation coefficient on corresponding intensities between half data sets. †† Mid-slope of the anomalous normal probability plot of ΔI_ano/σ(ΔI_ano), where ΔI_ano = I⁺ − I⁻ (see Evans, 2011). ‡‡ CC_ano(1/2) is the correlation coefficient between corresponding anomalous differences between half data sets.
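The linear cell variation used by BLEND to rank clusters can be approximated with a short calculation. The sketch below assumes that LCV is the largest relative difference, over all pairs of data sets, in the diagonals of the three unit-cell faces, and it is restricted to orthorhombic cells so that each diagonal is simply the hypotenuse of two cell edges. Both the simplified definition and the three example cells are assumptions for illustration, not values from Table 1.

```python
import math
from itertools import combinations


def face_diagonals(a, b, c):
    """Diagonals D_ab, D_ac and D_bc of an orthorhombic cell (Å)."""
    return (math.hypot(a, b), math.hypot(a, c), math.hypot(b, c))


def linear_cell_variation(cells):
    """Return (LCV in %, largest absolute diagonal difference in Å).

    cells is a list of (a, b, c) tuples for orthorhombic cells.
    """
    lcv_percent = 0.0
    max_delta = 0.0
    for cell_i, cell_j in combinations(cells, 2):
        for d_i, d_j in zip(face_diagonals(*cell_i), face_diagonals(*cell_j)):
            delta = abs(d_i - d_j)
            lcv_percent = max(lcv_percent, 100.0 * delta / min(d_i, d_j))
            max_delta = max(max_delta, delta)
    return lcv_percent, max_delta


# Hypothetical cells scattered around the mean cell reported in this paper.
example_cells = [
    (66.8, 103.5, 141.5),
    (66.5, 103.3, 140.9),
    (67.1, 103.7, 142.1),
]
lcv, max_diff = linear_cell_variation(example_cells)
print(f"LCV = {lcv:.2f}%, largest diagonal difference = {max_diff:.2f} Å")
```

For a face diagonal of roughly 155 Å, an LCV of about 1.5% corresponds to a difference of a little over 2 Å, which is consistent with the relationship between the two quantities quoted above for the full set.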
In contrast, when processed to a cutoff resolution of 3.24 Å with phenix.scale_and_merge, the data from clusters defined in BLEND and for the entire data set (Table 3) exhibited CC_ano(1/2) values ranging from 0.29 to 0.74 and CC_ano values ranging from 0.49 to 0.72. The strongest anomalous signals were those from cluster 1, which includes three of the four data sets with the highest CC_ano values, and cluster 2, which includes the fourth (Table 1). Clusters composed of the largest number of data sets (1+2 and the set comprised of all data) exhibited the strongest anomalous correlation between half sets. We elected to retain a resolution limit of 3.4 Å for sulfur substructure calculations, in view of the observation that d″/sig for cluster 3, at 1.04, is near the useful limit. At a d″/sig of 1.45, the anomalous signal is much stronger for the full data set. However, due to the steep falloff in intensity with resolution, d″/sig falls to 1.2 at 3.27 Å and to 0.8 at 3.0 Å.

Sulfur substructure determination
Calculations to extract the positions of native anomalous scatterers were conducted with data sets processed using phenix.scale_and_merge, as these exhibited the strongest anomalous intensity differences. We attempted substructure determination using the phenix.hyss submodule, which employs both dual-space completion and log-likelihood-based completion methods. HySS was executed without automatic termination in brute-force mode. The log-likelihood gain (LLG) scores for clusters 1, 2 and 3 and the 1+2 supercluster were 191, 115, 106 and 217, respectively, with corresponding correlation coefficients of 0.089, 0.084, 0.078 and 0.070. From five to seven anomalous scatterers were identified from each of these data sets, and in each case one or two of these corresponded to a sulfur-atom position in the refined atomic model of Ric-8A within an error threshold of 2.0 Å. Operating on the entire data set, HySS identified eight anomalous scatterers, of which five corresponded to correct sulfur sites, with an LLG score of 308 and a correlation coefficient of 0.13. We executed the phenix.autosol procedure on the entire data set, enforcing the inclusion of all data to 3.4 Å resolution during execution of the HySS tool. In this instance, HySS identified 45 anomalous scatterers with an LLG score of 1460 and correlation coefficient of 0.31. Of these positions, 40 corresponded to Ric-8A sulfur atoms.
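Counting how many candidate sites fall within the 2.0 Å error threshold of a reference sulfur position can be sketched as a simple one-to-one distance match. The sketch below assumes that both site lists are already in Cartesian coordinates on the same origin and hand; it does not perform the symmetry and origin-shift search that a dedicated comparison program carries out, and all coordinates are invented.

```python
import math


def count_matching_sites(candidates, reference, tolerance=2.0):
    """Count candidate sites lying within `tolerance` Å of a reference site.

    Each reference site can be matched at most once. Coordinates are
    (x, y, z) tuples in Å on a common origin and hand (an assumption).
    """
    used = set()
    matched = 0
    for site in candidates:
        best = None  # (distance, reference index) of the closest free site
        for index, ref in enumerate(reference):
            if index in used:
                continue
            distance = math.dist(site, ref)
            if distance <= tolerance and (best is None or distance < best[0]):
                best = (distance, index)
        if best is not None:
            used.add(best[1])
            matched += 1
    return matched


# Hypothetical candidate and reference positions.
candidate_sites = [(0.0, 0.0, 0.0), (10.0, 10.0, 10.0), (5.0, 5.0, 5.0)]
reference_sites = [(0.5, 0.0, 0.0), (10.0, 10.0, 11.5), (30.0, 30.0, 30.0)]
print(count_matching_sites(candidate_sites, reference_sites))
```

A greedy match of this kind is adequate when sites are well separated, as sulfur atoms in a protein generally are at this tolerance.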
We then turned to SHELXC/D (Pape & Schneider, 2004; Sheldrick, 2010) to determine the anomalously scattering substructure of Ric-8A. Based on SHELXC analysis, data within the 3.4-3.6 Å range, for which ⟨|Δ_ano|/σ(Δ_ano)⟩ (d″/sig) ≈ 1.4, were set as the high-resolution shell for all clusters and for the full data set (Fig. 3a). A substructure search using SHELXD was performed to test a maximum of 10 000 trials. For each solution, SHELXD computes CC_ano for all reflections (CC_all) and for a set composed of the weak reflections (CC_weak). In general, a bimodal distribution is expected for CC_all/CC_weak, in which correct, or nearly correct, substructure solutions form a cluster with relatively high values of CC_all/CC_weak. Such a distribution was observed for the all-data cluster, cluster 1+2 and cluster 1, for which the highest-ranking solutions afforded CC_all = 44.1, CC_weak = 18.2; CC_all = 43.2, CC_weak = 17.1; and CC_all = 41.5, CC_weak = 17.3, respectively (Figs. 3b, 3e and 3f). The top-ranked solutions for cluster 1, cluster 1+2 and the all-data cluster, respectively, included 36, 34 and 36 correct sulfur positions. Solutions for clusters 2 and 3 found only three and one, respectively, of the correct sulfur sites. Remarkably, SHELXD yielded several correct solutions for the all-data cluster and cluster 1+2 within 100 trials (Fig. 3f). SHELXC operating on the all-data cluster merged and scaled using AIMLESS (Table 2) extracted anomalous differences with a d″/sig of 0.6 at 3.4 Å, and subsequent execution of SHELXD yielded a monomodal distribution of CC_all versus CC_weak with maximum values of 25.0 and 9.6, respectively. Three of the 56 anomalous scattering sites in the latter highest-ranking solution corresponded to sulfur positions in the refined model.
All of the clusters that afforded a correct solution included the large and highly redundant data sets 2_10, 2_14, 2_15 and 2_7, of which 2_7, 2_10 and 2_14 also exhibited relatively high CC_ano values (Table 1).

Figure 2. Data-set clusters generated using BLEND. The linear cell variation (LCV) is indicated for all 18 data sets and for each cluster.
Operating on the all-data cluster processed with phenix.scale_and_merge, we employed the phase-retrieval program PRASA integrated in the CRANK2 suite to find the sulfur substructure of Ric-8A. PRASA conducted phasing trials at four high-resolution cutoffs ranging from 3.9 to 3.15 Å. The best solution emerged from refinement of solutions with a high-resolution cutoff of 3.4 Å, yielding a CC_ano(1/2) (Karplus & Diederichs, 2012) of 25.9. Of the 40 S atoms in the asymmetric unit, PRASA correctly identified 35.

Structure determination
The structure of Ric-8A deposited as PDB entry 6nmg (Zeng et al., 2019) was determined using the anomalous phases derived from the positions of the 36 S atoms identified by SHELXC/D as described above. The correct hand of the substructure was identified by density modification in SHELXE. We used the ANOmalous DEnsity analysis program (ANODE) in the SHELXC/D/E suite to compute the phased anomalous peak heights corresponding to S atoms in the 3.4 Å resolution anomalous difference map. These values ranged from 5.8 to 16.9, with 29 sulfur atoms having values exceeding 8. Coordinates of the substructure atoms were submitted to the phenix.autosol pipeline for SAD phasing in Phaser (Adams et al., 2010; Terwilliger et al., 2009). Four additional sulfur atoms were located, yielding an overall figure of merit of 0.378 for the SAD phase set. Phasing and density-modification calculations yielded a promising solution with R-factor, map skew and model-map cross-correlation values of 0.2473, 0.10 and 0.79, respectively. Visual inspection of the electron-density map using Coot (Emsley et al., 2010) shows continuous electron density corresponding to the predominant helical secondary structure of Ric-8A (Fig. 4a).
An initial model was constructed from the electron-density map computed with SAD phases from the sulfur substructure using the AutoBuild wizard (Terwilliger et al., 2008). AutoBuild was able to trace helical fragments accounting for 16% of the asymmetric unit. After removing questionable residues, the main chains of both Ric-8A molecules in the asymmetric unit were retraced manually in a σ_A-weighted 2mF_o − DF_c map at 3.4 Å resolution using Coot (Emsley et al., 2010), initially around the sulfur substructure sites and the AutoBuild model. The initial phases were further refined and extended to 2.2 Å resolution using a native data set collected with X-rays of wavelength 0.979 Å (Fig. 4c). Fragments of additional main chain were constructed after iterative manual model rebuilding and refinement with the phenix.refine tool (Afonine et al., 2012). The registry of the sequence with respect to electron density was determined from the residues around the sulfur sites or bulky residues in both chains. NCS refinement was abandoned after the first few refinement cycles since the two molecules in the asymmetric unit (r.m.s.d. on Cα atoms of 0.718 Å between chains A and B) exhibited positional differences of >3 Å between corresponding Cα atoms and because several loop regions were disordered in chain B. An anomalous difference map computed with phases from the final model confirmed the 40 sulfur sites revealed in the anomalous sulfur substructure corresponding to nine methionine and nine cysteine residues from each of the two Ric-8A molecules in the asymmetric unit (Fig. 5). Four of the sulfur atoms in the substructure corresponded to sulfate ions derived from the crystallization buffer. Met426 was not located, possibly due to its flexibility in the structure. The final refinement statistics, indices of model quality and a description of the molecular architecture of Ric-8A and its relation to biological function are reported in Zeng et al. (2019).

Conclusions
With the advent of powerful beamlines and advanced phasing algorithms that combine sophisticated Patterson search procedures, direct methods and maximum-likelihood methods, experimental phasing using the anomalous intensity differences from native sulfur atoms has become routine. However, the method can present challenges for relatively small, moderately diffracting crystals that harbor large asymmetric units. Here, we have described our experience in the application of sulfur SAD phasing to determination of the structure of the G-protein-binding domain of Ric-8A, a 51 kDa protein that crystallized in an orthorhombic space group with two molecules in the asymmetric unit.
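To make the scale of the difficulty concrete, the expected Bijvoet ratio can be estimated with the widely used Hendrickson-Teeter approximation, ⟨|ΔF|⟩/⟨|F|⟩ ≈ √2 · √(N_A/N_P) · f″/Z_eff. The sketch below is illustrative only: the count of 40 sulfur sites is taken from the text, whereas the number of non-H protein atoms (about 7200 for two 51 kDa chains) and the value of f″ for sulfur near 7 keV (about 0.7 e) are rough assumptions introduced here, not values reported by the authors.

```python
import math

# Effective number of electrons per "average" protein atom, the value
# conventionally used with this approximation.
Z_EFF = 6.7


def bijvoet_ratio(n_anomalous: int, n_protein_atoms: int,
                  f_double_prime: float, z_eff: float = Z_EFF) -> float:
    """Estimate <|dF|>/<|F|> for n_anomalous identical anomalous scatterers."""
    if n_anomalous <= 0 or n_protein_atoms <= 0:
        raise ValueError("atom counts must be positive")
    return (math.sqrt(2.0)
            * math.sqrt(n_anomalous / n_protein_atoms)
            * f_double_prime / z_eff)


if __name__ == "__main__":
    n_sulfur = 40            # from the text: 36 protein S atoms + 4 sulfate ions
    n_atoms = 7200           # ASSUMPTION: ~3600 non-H atoms per 51 kDa chain, two chains
    f_dp_sulfur = 0.7        # ASSUMPTION: approximate f'' of S near 7 keV, in electrons
    ratio = bijvoet_ratio(n_sulfur, n_atoms, f_dp_sulfur)
    print(f"Estimated Bijvoet ratio: {100 * ratio:.2f}%")
```

Under these assumptions the estimate comes out close to 1%, in line with the order of magnitude usually quoted for native sulfur SAD, which is why measurement errors of a few percent per observation must be averaged down by high multiplicity.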
Crucial to the success of this project was the use of the NSLS II FMX (17-2) beamline as an X-ray source. Important attributes that contributed to accurate determination of anomalous intensity differences included an exceptionally high flux microfocus beam and a precision goniometer to position crystals within the 10 µm beam diameter, affording data collection in helical mode to minimize radiation decay. The helium-filled beam path and readout from the fast EIGER 16M pixel-array detector allowed the rapid recording of diffraction intensities with a minimum of air scatter.

Figure 4. σA-Weighted 2mFo − DFc Ric-8A electron-density maps at successive stages in the phasing procedure. 3.4 Å resolution electron-density maps computed with SAD phases from the anomalous scattering substructure corresponding to the highest ranking solution determined by SHELXD using all 18 scaled and merged data sets are shown before (a) and after (b) density modification by solvent flattening. (c) σA-Weighted mFo − DFc electron-density map computed with native data measured at a wavelength of 0.979 Å to 2.2 Å resolution, with phases calculated from the final refined model. The refined Ric-8A model is shown in stick mode (C atoms in dark purple); electron-density maps were contoured at 1.5σ.
We implemented alternative scaling/merging and substructure-search strategies encoded in publicly available program suites. In so doing, we approached the problem from the perspective of a routine user, making no attempt to modify existing software and, in most instances, not exploring program capabilities beyond those accessible through default options. Within this framework, we offer several observations that may be of use to other researchers who embark on S-SAD phasing of large asymmetric units of less-than-ideal crystals.
Firstly, high data redundancy is essential. This is a well recognized criterion for successful phase determination by S-SAD. In the case of Ric-8A crystals, the multiplicity afforded by over eleven million intensity observations from 18 crystals, several of which were scanned over ten or more 2π rotations about the φ axis, proved to be critical. Importantly, the combination of multiplicity and unit-cell isomorphism proved to be decisive in defining the sulfur substructure. All of the data-set combinations that afforded the correct anomalously scattering substructure included the largest data sets from three highly isomorphous crystals that were aggregated in the same BLEND cluster. Secondly, we found that local scaling, as implemented in phenix.scale_and_merge, appeared to be more effective in extracting significant anomalous differences than the weighted φ-scaling implemented in AIMLESS. Finally, in our hands, it was possible to retrieve the anomalous scattering substructure using SHELXD, but not with the phenix.hyss tool.
To extract the anomalous substructure from crystals of Ric-8A, we collected X-ray data at 7 keV (1.7712 Å), at which the sulfur anomalous signal is significant while absorbance is manageable. However, modeling and experimental studies indicate that, with a V-shaped detector geometry, data collection at longer wavelengths approaching 3 Å, where f″ is stronger, is advantageous if the cross section of the crystal and the surrounding cryoprotectant is small, in the neighborhood of 100 µm or less, so that the effects of photon absorption are relatively low (Basu et al., 2019). Beamlines BL-1A at the Photon Factory and I23 at Diamond Light Source are able to achieve such wavelengths, and several successful structure determinations of challenging targets using these facilities have recently been reported (Bent et al., 2016; Parker & Newstead, 2017; Basu et al., 2019). The FMX beamline can access energies down to 5 keV (2.48 Å), and we speculate that data collected at this energy might have reduced the requirement for high data multiplicity in SAD phasing of Ric-8A crystals by virtue of the higher anomalous signal-to-noise ratio that would be afforded at a wavelength closer to the sulfur K edge. Indeed, data sets collected with less than tenfold multiplicity on the I23 beamline at wavelengths ranging from 3.09 to 4.96 Å have led to successful structure determinations by native SAD phasing (Aurelius et al., 2017; Langan et al., 2018; Bent et al., 2016).
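The photon energies and wavelengths quoted above are related by λ = hc/E. The short sketch below reproduces the conversions as a convenience for readers comparing beamline settings; the function names are illustrative, and the sulfur K-edge energy of about 2.472 keV is a commonly tabulated value added here for reference rather than a figure taken from the text.

```python
# Planck constant times speed of light in keV * Angstrom (CODATA, rounded).
HC_KEV_ANGSTROM = 12.398419843

# ASSUMPTION: commonly tabulated sulfur K absorption edge energy, in keV.
SULFUR_K_EDGE_KEV = 2.472


def energy_to_wavelength(energy_kev: float) -> float:
    """Convert photon energy in keV to wavelength in Angstrom."""
    if energy_kev <= 0:
        raise ValueError("energy must be positive")
    return HC_KEV_ANGSTROM / energy_kev


def wavelength_to_energy(wavelength_angstrom: float) -> float:
    """Convert wavelength in Angstrom to photon energy in keV."""
    if wavelength_angstrom <= 0:
        raise ValueError("wavelength must be positive")
    return HC_KEV_ANGSTROM / wavelength_angstrom


if __name__ == "__main__":
    # Energies mentioned in the text for the FMX beamline.
    for energy in (7.0, 5.0):
        print(f"{energy:.1f} keV -> {energy_to_wavelength(energy):.4f} A")
    # Wavelength range quoted for I23 data sets.
    for wavelength in (3.09, 4.96):
        print(f"{wavelength:.2f} A -> {wavelength_to_energy(wavelength):.3f} keV")
    print(f"S K edge: {energy_to_wavelength(SULFUR_K_EDGE_KEV):.2f} A")
```

The conversion shows that the longest I23 wavelength quoted (4.96 Å, roughly 2.5 keV) lies just on the high-energy side of the sulfur K edge, whereas 7 keV is far above it.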